
feat(native): Add support for REST based remote function #23568

Merged
tdcmeehan merged 8 commits into prestodb:master from Joe-Abraham:remote
Nov 18, 2025

@Joe-Abraham
Contributor

@Joe-Abraham Joe-Abraham commented Sep 2, 2024

Description

Adds support for REST-based remote functions to Presto native execution.

The following features will be added in subsequent PRs and are not addressed in this PR

Motivation and Context

Impact

  • New capability for remote function execution via REST protocol
  • No changes to existing public APIs or user-facing features
  • Minimal performance impact as registration is one-time per function location
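
The one-time registration claim above could be realized roughly as follows. This is only a sketch under stated assumptions, not the PR's implementation: `LocationRegistry`, `registerOnce`, and the callback shape are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical registry (not from the PR): runs the registration callback
// at most once per function location, even when called concurrently.
class LocationRegistry {
 public:
  // Returns true if this call performed the registration, false if the
  // location had already been registered.
  bool registerOnce(
      const std::string& location,
      const std::function<void()>& doRegister) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (registered_.count(location) > 0) {
      return false;
    }
    doRegister();
    registered_.insert(location);
    return true;
  }

  // Number of distinct locations registered so far.
  std::size_t size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return registered_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_set<std::string> registered_;
};
```

Holding the lock while the callback runs keeps the sketch simple and guarantees a single registration per location; a real implementation might prefer finer-grained locking if registration is slow.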

Test Plan

  • Container-based integration tests with embedded function server
  • Multi-threaded concurrent registration and execution scenarios
  • HTTP response validation and error handling tests

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==
Prestissimo (Native Execution) Changes
* Add support for REST API for Remote Functions in native engine. [RFC-007](https://github.com/prestodb/rfcs/blob/main/RFC-0007-remote-functions.md)

Co-authored-by: Wills Feng wills.feng@ibm.com

@Joe-Abraham Joe-Abraham force-pushed the remote branch 6 times, most recently from 808f416 to 1c7ea23 Compare September 10, 2024 09:49
@Joe-Abraham Joe-Abraham force-pushed the remote branch 7 times, most recently from 86c1f8b to 3345b51 Compare September 17, 2024 14:30
@Joe-Abraham Joe-Abraham force-pushed the remote branch 3 times, most recently from 0d73f09 to f530b51 Compare September 26, 2024 08:08
@Joe-Abraham Joe-Abraham force-pushed the remote branch 10 times, most recently from 32a8715 to 5726f00 Compare October 16, 2024 06:45
@Joe-Abraham Joe-Abraham force-pushed the remote branch 4 times, most recently from a8afff9 to 26f9fe6 Compare October 29, 2024 06:06
@steveburnett
Contributor

Thanks for the release note! Please format it so the automation picks it up, like:

== NO RELEASE NOTE ==

Contributor

@czentgr czentgr left a comment

Almost there.

Comment thread presto-native-execution/presto_cpp/main/common/Utils.cpp
Contributor

@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham

Comment thread presto-native-execution/presto_cpp/main/functions/remote/client/CMakeLists.txt Outdated
@aditi-pandit
Contributor

@Joe-Abraham: Please also work on a follow-up PR for user documentation on how to set up and use REST functions in Native SQL engine queries.

Contributor

@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham

Comment thread presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.h Outdated
RemoteDoubleDivHandler(
velox::RowTypePtr inputTypes,
velox::TypePtr outputType)
: RemoteFunctionRestHandler(
Contributor

This is a bit counterintuitive to me. The function body assumes this function is (DOUBLE(), DOUBLE()) -> DOUBLE(), which means that RemoteDoubleDivHandler shouldn't take types as input; it should only call RemoteFunctionRestHandler with the correct types.

Contributor Author

This handler is hardcoded to work with a signature of (DOUBLE, DOUBLE) -> DOUBLE for testing purposes. While it could be extended to support generic numeric types (e.g., using templates or type checking), since that’s not the primary focus of these tests, we maintain the hardcoded types for simplicity.

Contributor

There seems to be a misunderstanding here. I feel the code for this and other functions should be as follows, since the function knows what types it supports. The client shouldn't have to specify the types in the register functions.

  // The handler supplies its own fixed signature: (DOUBLE, DOUBLE) -> DOUBLE.
  RemoteDoubleDivHandler()
      : RemoteFunctionRestHandler(
            std::vector<TypePtr>{DOUBLE(), DOUBLE()}, // input types
            DOUBLE()) {} // output type

Contributor Author

Got it, I have made improvements to the RemoteFunctionRestHandler.
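
The design the reviewer is asking for can be illustrated with a self-contained sketch. It deliberately avoids Velox: `TypeName`, `RestHandlerBase`, and `DoubleDivHandler` are hypothetical stand-ins for `velox::TypePtr`, `RemoteFunctionRestHandler`, and `RemoteDoubleDivHandler`, and the real class shapes may differ.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for a type descriptor; the real code uses velox::TypePtr.
using TypeName = std::string;

// Hypothetical base class mirroring the shape discussed in the review.
class RestHandlerBase {
 public:
  RestHandlerBase(std::vector<TypeName> inputTypes, TypeName outputType)
      : inputTypes_(std::move(inputTypes)),
        outputType_(std::move(outputType)) {}
  virtual ~RestHandlerBase() = default;

  const std::vector<TypeName>& inputTypes() const { return inputTypes_; }
  const TypeName& outputType() const { return outputType_; }

 private:
  std::vector<TypeName> inputTypes_;
  TypeName outputType_;
};

// The handler knows its own signature, so its constructor takes no types
// and callers cannot register it with a mismatched signature.
class DoubleDivHandler : public RestHandlerBase {
 public:
  DoubleDivHandler()
      : RestHandlerBase(std::vector<TypeName>{"DOUBLE", "DOUBLE"}, "DOUBLE") {}

  double compute(double a, double b) const { return a / b; }
};
```

With this shape, a test can construct `DoubleDivHandler` directly and read the signature back from the handler instead of repeating it at the registration site.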

assertEqualVectors(expected, results);
}

TEST_P(RemoteFunctionRestTest, removeCharactersFromString) {
Contributor

Nit: Shorten function names, e.g. removeChar

assertEqualVectors(expected, results);
}

TEST_P(RemoteFunctionRestTest, inverseCdfInvalidInputsServerThrowsException) {
Contributor

Nit: Shorten to inverseCdfException

Contributor

@sourcery-ai sourcery-ai Bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/tests/RestFunctionHandleTest.cpp:121-116` </location>
<code_context>
+TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithDecimalType) {
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for edge cases with decimal types, such as negative scales, very large precisions, or invalid type strings.

Additional tests for invalid decimal type strings, negative scales, and large precisions will help verify robust parsing and error handling.

Suggested implementation:

```cpp
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithDecimalType) {
  try {
    const std::string str = R"JSON(
{
  "@type": "RestFunctionHandle",
  "functionId": "remote.testSchema.decimalFunction;decimal(10,2);decimal(10,2)",
  "version": "v1",
  "signature": {
    "name": "decimalFunction",
    "kind": "SCALAR",
    "returnType": "decimal(10,2)",

```

```cpp
  } catch (const std::exception& e) {
    FAIL() << "Exception: " << e.what();
  }
}

// Edge case: negative scale
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithNegativeScale) {
  const std::string str = R"JSON(
{
  "@type": "RestFunctionHandle",
  "functionId": "remote.testSchema.decimalFunction;decimal(10,-2);decimal(10,-2)",
  "version": "v1",
  "signature": {
    "name": "decimalFunction",
    "kind": "SCALAR",
    "returnType": "decimal(10,-2)",
    "arguments": [
      {
        "name": "arg0",
        "type": "decimal(10,-2)"
      }
    ]
  }
}
)JSON";
  try {
    auto handle = RestFunctionHandle::fromJson(str);
    FAIL() << "Expected exception for negative scale, but parsing succeeded.";
  } catch (const std::exception& e) {
    SUCCEED();
  }
}

// Edge case: very large precision
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithLargePrecision) {
  const std::string str = R"JSON(
{
  "@type": "RestFunctionHandle",
  "functionId": "remote.testSchema.decimalFunction;decimal(1000,2);decimal(1000,2)",
  "version": "v1",
  "signature": {
    "name": "decimalFunction",
    "kind": "SCALAR",
    "returnType": "decimal(1000,2)",
    "arguments": [
      {
        "name": "arg0",
        "type": "decimal(1000,2)"
      }
    ]
  }
}
)JSON";
  try {
    auto handle = RestFunctionHandle::fromJson(str);
    // Depending on implementation, this may succeed or fail.
    // If large precision is not supported, expect an exception.
    // Otherwise, check the type.
    auto returnType = handle->signature().returnType();
    EXPECT_EQ(returnType->kind(), TypeKind::DECIMAL);
    EXPECT_EQ(returnType->asDecimal().precision(), 1000);
    EXPECT_EQ(returnType->asDecimal().scale(), 2);
  } catch (const std::exception& e) {
    // Acceptable if implementation throws for large precision.
    SUCCEED();
  }
}

// Edge case: invalid type string
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithInvalidTypeString) {
  const std::string str = R"JSON(
{
  "@type": "RestFunctionHandle",
  "functionId": "remote.testSchema.decimalFunction;decimal(foo,bar);decimal(foo,bar)",
  "version": "v1",
  "signature": {
    "name": "decimalFunction",
    "kind": "SCALAR",
    "returnType": "decimal(foo,bar)",
    "arguments": [
      {
        "name": "arg0",
        "type": "decimal(foo,bar)"
      }
    ]
  }
}
)JSON";
  try {
    auto handle = RestFunctionHandle::fromJson(str);
    FAIL() << "Expected exception for invalid decimal type string, but parsing succeeded.";
  } catch (const std::exception& e) {
    SUCCEED();
  }
}

```
</issue_to_address>
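
The edge cases named in Comment 1 (negative scale, very large precision, non-numeric parameters) can also be pinned down with a small standalone validator. This is a sketch only: `parseDecimalType` and `DecimalSpec` are hypothetical helpers, not part of the PR or of Velox, and the upper bound of 38 on precision is an assumption of the sketch.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct DecimalSpec {
  int precision;
  int scale;
};

// Parses "decimal(p,s)" into `out`. Returns false for malformed strings,
// negative or non-numeric parameters, precision outside [1, kMaxPrecision],
// or scale > precision. kMaxPrecision = 38 is an assumption of this sketch.
inline bool parseDecimalType(const std::string& text, DecimalSpec& out) {
  const int kMaxPrecision = 38;
  // Digits only, so a leading '-' or letters fail the match outright.
  static const std::regex kPattern(R"(^decimal\((\d{1,4}),\s*(\d{1,4})\)$)");
  std::smatch match;
  if (!std::regex_match(text, match, kPattern)) {
    return false;
  }
  const int precision = std::stoi(match[1].str());
  const int scale = std::stoi(match[2].str());
  if (precision < 1 || precision > kMaxPrecision || scale > precision) {
    return false;
  }
  out.precision = precision;
  out.scale = scale;
  return true;
}
```

Under these assumptions `decimal(10,2)` is accepted, while `decimal(10,-2)`, `decimal(1000,2)`, and `decimal(foo,bar)` are all rejected, which gives each suggested test a single unambiguous expected outcome.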

### Comment 2
<location> `presto-native-execution/presto_cpp/main/types/tests/RestFunctionHandleTest.cpp:180-116` </location>
<code_context>
+TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithArrayType) {
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests for empty arrays and arrays with nested nulls.

Please include test cases for empty arrays and arrays containing null elements to verify correct handling by the parser.

Suggested implementation:

```cpp
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithArrayType) {
  try {
    const std::string str = R"JSON(
        {
          "@type": "RestFunctionHandle",
          "functionId": "remote.testSchema.arrayFunction;array(bigint);array(varchar)",
          "version": "v1",
          "signature": {
            "name": "arrayFunction",
            "kind": "SCALAR",
            "returnType": "array(bigint)",

```

```cpp
  } catch (const std::exception& e) {
    FAIL() << "Exception: " << e.what();
  }
}

// Test for empty array argument
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithEmptyArray) {
  try {
    const std::string str = R"JSON(
        {
          "@type": "RestFunctionHandle",
          "functionId": "remote.testSchema.emptyArrayFunction;array(bigint)",
          "version": "v1",
          "signature": {
            "name": "emptyArrayFunction",
            "kind": "SCALAR",
            "returnType": "array(bigint)",
            "argumentTypes": ["array(bigint)"]
          },
          "arguments": [
            []
          ]
        }
    )JSON";

    auto handle = RestFunctionHandle::fromJson(str);
    ASSERT_NE(handle, nullptr);
    ASSERT_EQ(handle->signature().name, "emptyArrayFunction");
    ASSERT_EQ(handle->arguments().size(), 1);
    // Check that the argument is an empty array
    auto* arrayArg = dynamic_cast<const ArrayArgument*>(handle->arguments()[0].get());
    ASSERT_NE(arrayArg, nullptr);
    EXPECT_TRUE(arrayArg->values().empty());
  } catch (const std::exception& e) {
    FAIL() << "Exception: " << e.what();
  }
}

// Test for array with nested nulls
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithArrayContainingNulls) {
  try {
    const std::string str = R"JSON(
        {
          "@type": "RestFunctionHandle",
          "functionId": "remote.testSchema.nullArrayFunction;array(bigint)",
          "version": "v1",
          "signature": {
            "name": "nullArrayFunction",
            "kind": "SCALAR",
            "returnType": "array(bigint)",
            "argumentTypes": ["array(bigint)"]
          },
          "arguments": [
            [1, null, 3, null]
          ]
        }
    )JSON";

    auto handle = RestFunctionHandle::fromJson(str);
    ASSERT_NE(handle, nullptr);
    ASSERT_EQ(handle->signature().name, "nullArrayFunction");
    ASSERT_EQ(handle->arguments().size(), 1);
    auto* arrayArg = dynamic_cast<const ArrayArgument*>(handle->arguments()[0].get());
    ASSERT_NE(arrayArg, nullptr);
    ASSERT_EQ(arrayArg->values().size(), 4);
    // Check for nulls in the array
    EXPECT_TRUE(arrayArg->values()[1] == nullptr);
    EXPECT_TRUE(arrayArg->values()[3] == nullptr);
    EXPECT_EQ(*static_cast<const int64_t*>(arrayArg->values()[0].get()), 1);
    EXPECT_EQ(*static_cast<const int64_t*>(arrayArg->values()[2].get()), 3);
  } catch (const std::exception& e) {
    FAIL() << "Exception: " << e.what();
  }
}

```
</issue_to_address>

### Comment 3
<location> `presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoContainerRemoteFunction.java:42-43` </location>
<code_context>
+                true);
+    }
+
+    @Test
+    public void testRemoteBasicTests()
+    {
+        assertEquals(
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for remote functions with invalid arguments or unsupported types.

Please include test cases that verify error handling when remote functions receive invalid arguments or unsupported types.
</issue_to_address>

### Comment 4
<location> `presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoContainerRemoteFunction.java:71-84` </location>
<code_context>
+    }
+
+    @Test
+    public void testRemoteFunctionAppliedToColumn()
+    {
+        assertEquals(computeActual("SELECT remote.default.floor(o_totalprice) FROM tpch.sf1.orders")
+                .getMaterializedRows().size(), 1500000);
+        assertEquals(computeActual("SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem")
+                .getMaterializedRows().size(), 6001215);
+        assertQueryWithSameQueryRunner(
+                "SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem",
+                "SELECT abs(l_discount) FROM tpch.sf1.lineitem");
+        assertEquals(computeActual("SELECT remote.default.length(CAST(o_comment AS VARBINARY)) FROM tpch.sf1.orders")
+                .getMaterializedRows().size(), 1500000);
+    }
+}
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for remote functions on empty tables or with null values.

Please include tests for remote functions on empty tables and columns containing nulls to verify correct behavior in these cases.

```suggestion
    @Test
    public void testRemoteFunctionAppliedToColumn()
    {
        assertEquals(computeActual("SELECT remote.default.floor(o_totalprice) FROM tpch.sf1.orders")
                .getMaterializedRows().size(), 1500000);
        assertEquals(computeActual("SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem")
                .getMaterializedRows().size(), 6001215);
        assertQueryWithSameQueryRunner(
                "SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem",
                "SELECT abs(l_discount) FROM tpch.sf1.lineitem");
        assertEquals(computeActual("SELECT remote.default.length(CAST(o_comment AS VARBINARY)) FROM tpch.sf1.orders")
                .getMaterializedRows().size(), 1500000);
    }

    @Test
    public void testRemoteFunctionOnEmptyTable()
    {
        // Create an empty table for testing
        computeActual("CREATE TABLE test_empty_table (x INTEGER)");
        // Query remote function on empty table
        assertEquals(computeActual("SELECT remote.default.abs(x) FROM test_empty_table").getMaterializedRows().size(), 0);
        // Drop the table after test
        computeActual("DROP TABLE test_empty_table");
    }

    @Test
    public void testRemoteFunctionOnNullValues()
    {
        // Create a table with null values
        computeActual("CREATE TABLE test_null_table (y INTEGER)");
        computeActual("INSERT INTO test_null_table VALUES (NULL), (NULL), (1), (NULL)");
        // Query remote function on column with nulls
        List<MaterializedRow> result = computeActual("SELECT remote.default.abs(y) FROM test_null_table").getMaterializedRows();
        // There should be 4 rows, 3 nulls and 1 with value 1
        int nullCount = 0;
        int oneCount = 0;
        for (MaterializedRow row : result) {
            // MaterializedRow (com.facebook.presto.testing) exposes column values via getField
            Object value = row.getField(0);
            if (value == null) {
                nullCount++;
            } else if (value.toString().equals("1")) {
                oneCount++;
            }
        }
        assertEquals(nullCount, 3);
        assertEquals(oneCount, 1);
        // Drop the table after test
        computeActual("DROP TABLE test_null_table");
    }
}
```
</issue_to_address>


@steveburnett
Contributor

Thanks for the release note! Formatting nit: the Presto documentation uses .rst links, not Markdown-format links.

== RELEASE NOTES ==
Prestissimo (Native Execution) Changes
* Add support for REST API for Remote Functions in native engine. See `RFC-007 <https://github.com/prestodb/rfcs/blob/main/RFC-0007-remote-functions.md>`_.

Contributor

@aditi-pandit aditi-pandit left a comment

Have a minor nit.

@czentgr, @majetideepak: Could you please do a round of review?

Comment thread presto-native-execution/presto_cpp/main/common/Utils.cpp Outdated
Comment thread presto-native-execution/presto_cpp/main/common/Utils.cpp Outdated
aditi-pandit previously approved these changes Nov 14, 2025
Contributor

@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham.

Contributor

@czentgr czentgr left a comment

Thanks, one nit: replace some string literals with the corresponding existing constants.

}

auto msg = fmt::format(
"Unsupported Content-Type: '{}'. Expecting 'application/X-presto-pages' "
Contributor

Nit: we have constants CONTENT_TYPE_PRESTO_PAGE and CONTENT_TYPE_SPARK_UNSAFE_ROW that can be used here and below.
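
A rough sketch of what this nit amounts to: build both the check and the error message from named constants instead of repeating literals. Everything here is illustrative. The constant names are local to the sketch, the Presto page value is taken from the message quoted above, and the Spark value is a placeholder because the real constant's string is not shown in this thread.

```cpp
#include <string>

namespace sketch {

// Value copied from the error message quoted in the review.
const char* const kContentTypePrestoPage = "application/X-presto-pages";
// Placeholder value (assumption): the real constant's string is not shown.
const char* const kContentTypeSparkUnsafeRow = "application/X-spark-unsafe-row";

// Returns an empty string when the content type is accepted, otherwise an
// error message assembled from the constants above.
inline std::string validateContentType(const std::string& contentType) {
  if (contentType == kContentTypePrestoPage ||
      contentType == kContentTypeSparkUnsafeRow) {
    return "";
  }
  return "Unsupported Content-Type: '" + contentType + "'. Expecting '" +
      kContentTypePrestoPage + "' or '" + kContentTypeSparkUnsafeRow + "'.";
}

} // namespace sketch
```

The comparison in this sketch is case-sensitive; whether the real handler should normalize case before comparing is left open here.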

Contributor

@czentgr czentgr left a comment

Thanks!

Contributor

@aditi-pandit aditi-pandit left a comment

Thanks @Joe-Abraham

@karthikeyann
Contributor

target presto-coordinator: failed to solve: failed to compute cache key: failed to calculate checksum of ref 6e508995-4f7f-42ba-86ea-aaad822ca674::vhmyhjt8k5bcql3reowkfof8n: "/presto-function-server-executable.jar": not found

during docker build of native execution

@Joe-Abraham
Contributor Author

target presto-coordinator: failed to solve: failed to compute cache key: failed to calculate checksum of ref 6e508995-4f7f-42ba-86ea-aaad822ca674::vhmyhjt8k5bcql3reowkfof8n: "/presto-function-server-executable.jar": not found

during docker build of native execution

@karthikeyann, I have updated the Jenkinsfile - #26662

@karthikeyann
Contributor

This failure occurs when building the coordinator image locally.

Labels

from:IBM PR from IBM

10 participants